Crash in [@ mozilla::nsDisplayItem::GetPerFrameKey]
Categories
(Core :: Layout, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr115 | --- | unaffected |
firefox-esr128 | --- | unaffected |
firefox136 | --- | unaffected |
firefox137 | --- | unaffected |
firefox138 | + | verified |
People
(Reporter: aryx, Assigned: emilio)
References
(Blocks 2 open bugs, Regression, )
Details
(5 keywords, Whiteboard: [viewtransitions:m1], [bugmon:bisected,confirmed], [wptsync upstream])
Crash Data
Attachments
(2 files)
17 crashes from 10 installs of Firefox 138.0a1 20250325210233 on macOS and Windows.
Emilio, any idea what started this?
Crash report: https://crash-stats.mozilla.org/report/index/e98a3039-8b45-4bbb-bdb8-953cc0250326
Reason:
EXCEPTION_ACCESS_VIOLATION_READ
Top 10 frames:
0 xul.dll mozilla::nsDisplayItem::GetPerFrameKey() const layout/painting/nsDisplayList.h:2156
0 xul.dll mozilla::MergeState::ProcessItemFromNewList(mozilla::nsDisplayItem*, mozilla:... layout/painting/RetainedDisplayListBuilder.cpp:464
0 xul.dll mozilla::RetainedDisplayListBuilder::MergeDisplayLists(mozilla::nsDisplayList... layout/painting/RetainedDisplayListBuilder.cpp:835
1 xul.dll mozilla::MergeState::MergeChildLists(mozilla::nsDisplayItem*, mozilla::nsDisp... layout/painting/RetainedDisplayListBuilder.cpp:509
1 xul.dll mozilla::MergeState::ProcessItemFromNewList(mozilla::nsDisplayItem*, mozilla:... layout/painting/RetainedDisplayListBuilder.cpp:481
1 xul.dll mozilla::RetainedDisplayListBuilder::MergeDisplayLists(mozilla::nsDisplayList... layout/painting/RetainedDisplayListBuilder.cpp:835
2 xul.dll mozilla::RetainedDisplayListBuilder::AttemptPartialUpdate(unsigned int) layout/painting/RetainedDisplayListBuilder.cpp:1667
2 xul.dll nsLayoutUtils::PaintFrame(gfxContext*, nsIFrame*, nsRegion const&, unsigned i... layout/base/nsLayoutUtils.cpp:3169
3 xul.dll mozilla::PresShell::PaintInternal(nsView*, mozilla::PaintInternalFlags) layout/base/PresShell.cpp:6683
4 xul.dll nsViewManager::ProcessPendingUpdatesPaint(nsIWidget*) view/nsViewManager.cpp:399
Assignee | ||
Comment 1•2 months ago
|
||
Obvious candidate would be bug 1955697. Bug 1955697 comment 16 is what landed the riskier changes, and that seems to have made it to 20250325210233.
There's an old crash with this signature in ESR128 but it should not be related.
Took a quick look and also there's no obvious pattern in the crashes (with regard to URIs and such). So my preference would be to try to get a repro from fuzzers over the next few days and if we can't get it on time back out from beta. I expect it to be a trivial-ish change...
An alternative, also related bug could be bug 1953823 (though that would need a non-default pref to be on...)
Assignee | ||
Updated•2 months ago
|
Comment 2•2 months ago
|
||
The bug is marked as tracked for firefox138 (nightly). We have limited time to fix this, the soft freeze is in a day. However, the bug still isn't assigned.
:fgriffith, could you please find an assignee for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit BugBot documentation.
Assignee | ||
Comment 3•2 months ago
|
||
See comment 1.
Comment 4•2 months ago
|
||
Thanks Emilio
Updated•2 months ago
|
Comment 5•2 months ago
|
||
This has been reported by live site testing and fuzzing. I will attach a reduced test case shortly.
Comment 6•2 months ago
|
||
Updated•2 months ago
|
Assignee | ||
Comment 7•2 months ago
|
||
Amazing, thanks! A pernosco recording would be awesome but otherwise I can look at this first thing tomorrow.
Comment 8•2 months ago
|
||
Verified bug as reproducible on mozilla-central 20250326192415-33af93ce40ec.
The bug appears to have been introduced in the following build range:
Start: c6ac35e2fa35adaac9047ed6123d616f6d3fdf5c (20250325081146)
End: d926f7fefc9dc9eab1d2e65a87ab2883c5ef5143 (20250325091625)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=c6ac35e2fa35adaac9047ed6123d616f6d3fdf5c&tochange=d926f7fefc9dc9eab1d2e65a87ab2883c5ef5143
Comment 9•2 months ago
|
||
Painting --- after optimization:
SolidColor p=0x10b2a25c0 f=0x10b22c020(Viewport(-1))
CompositorHitTestInfo p=0x10b2a2020 f=0x10b22c1b8(ScrollContainer(html)(-1))
AsyncZoom p=0x10b2a23f0 f=0x10b22c1b8(ScrollContainer(html)(-1))
CompositorHitTestInfo p=0x10b2a20d0 f=0x10b22c0d8(Canvas(html)(-1))
SolidColor p=0x10b2a21c8 f=0x10b22c0d8(Canvas(html)(-1))
CanvasBackgroundImage p=0x10b2a22e0 f=0x10b22c0d8(Canvas(html)(-1))
Painting --- Modified list (dirty 0,0,0,0):
SolidColor p=0x10b2a2b08 f=0x10b22c020(Viewport(-1))
CompositorHitTestInfo p=0x10b2a2690 f=0x10b22c1b8(ScrollContainer(html)(-1))
AsyncZoom p=0x10b2a2990 f=0x10b22c1b8(ScrollContainer(html)(-1))
SolidColor p=0x10b2a28d0 f=0x10b22c0d8(Canvas(html)(-1))
CompositorHitTestInfo p=0x10b2a2730 f=0x10b22c0d8(Canvas(html)(-1))
CanvasBackgroundImage p=0x10b2a27d0 f=0x10b22c0d8(Canvas(html)(-1))
The SolidColor and CompositorHitTestInfo items change places. If the order of items changes then there needs to be an invalidate marking these display items as needing to be rebuilt. I would guess no such invalidate is happening. The new code probably needs to be more careful about keeping the same order.
Comment 10•2 months ago
|
||
A Pernosco session is available here: https://pernos.co/debug/h8vyySQ8G0L903cZ4Emxxw/index.html
Comment 11•2 months ago
|
||
From the pernosco, the first time drawing we add solid color item here
The second time through we add the solid color item here
I'm guessing that the value of mCSSSpecified here
is changing without an invalidate. Probably after the image is loaded and decoded we set it to false here
and it's set to true before the image is decoded. mImage.IsOpaque() starts returning true because it calls imageContainer->WillDrawOpaqueNow() which checks if there is a decoded frame, and that decode was the one that painting requested, but the notification for that decode is still pending so we haven't invalidated.
I haven't looked at the new code in detail, but can we just put the solid color item in the same place always?
Comment 12•2 months ago
|
||
Based on comment #8, this bug contains a bisection range found by bugmon. However, the Regressed by
field is still not filled.
:emilio, if possible, could you fill the Regressed by
field and investigate this regression?
For more information, please visit BugBot documentation.
Comment 13•2 months ago
|
||
The bug is linked to a topcrash signature, which matches the following criterion:
- Top 10 desktop browser crashes on nightly
For more information, please visit BugBot documentation.
Assignee | ||
Comment 14•2 months ago
|
||
I haven't looked at the new code in detail, but can we just put the solid color item in the same place always?
No, because the color might need to be omitted for blending correctness. However in this case there's no blending so we should put it in the same place here...
Oh, I see what's going on, I think. That code inserts the canvas background at the bottom of the BorderBackground()
list, but that might not be right if the list wasn't empty at the beginning... Or something along those lines.
Assignee | ||
Comment 15•2 months ago
|
||
Instead of conditionally inserting the item at the bottom of the
BorderBackground() list, just suppress its color.
Assignee | ||
Updated•2 months ago
|
Comment 16•2 months ago
|
||
hoping we can land this as soon as possible since it is a top crasher and we are in soft code freeze.... so adding a NI for patch review
Comment 17•2 months ago
|
||
I didn't realize this patch was a crash fix, the commit message made it sound like a refactoring. Thanks for the needinfo, looking now.
Comment 18•2 months ago
|
||
Updated•2 months ago
|
Comment 20•2 months ago
|
||
bugherder |
Comment 21•2 months ago
|
||
Verified bug as fixed on rev mozilla-central 20250328212938-f31082d6f90f.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.
Updated•2 months ago
|
Description
•